Resilient Decentralized Consensus

Seminar, Project

Consensus is one of the fundamental principles that we can build on when designing distributed systems: Whenever a group of processes establishes consensus, all processes agree on the current state of the system. This allows building distributed systems in which a group of processes acts as one coherent state machine, also referred to as state-machine replication.

Since consensus algorithms (such as Paxos or Raft) can become rather complex, most applications rely on external tools such as databases or dedicated replication managers to achieve consensus. This limits application flexibility and makes it hard to optimize the consensus algorithm towards a specific application scenario.

Goal

When picked as a seminar, the goal of this topic is to get an overview of different consensus algorithms and compare them, especially with regards to their system assumptions. Which algorithm works best in which setting, e.g., data centers versus peer-to-peer applications?

When picked as a project, the goal of this topic is to implement several algorithms and develop a library of distributed consensus algorithms which can be embedded directly into a distributed program. This allows programmers to select application-specific algorithms which are fine-tuned for a specific use case.

The implementation of the algorithms should follow the principle of immutable replicated state (as seen in append-only log stores or CRDTs) to simplify reasoning and decrease complexity. Furthermore, it should focus on modularity and abstractions, such that several basic building blocks can be combined in multiple ways to achieve different algorithms.

Starting Points

Example Tool